ABSTRACT

Resources in Education, the Current Index to Journals in Education, and
Psychological Abstracts were computer searched in order to identify
documents and journal articles which describe the derivation, use, and
misuse of grade equivalent scores, those scores which reflect a student's
performance in a test or battery of tests according to grade norms. Each
of the 23 references is abstracted and a subject index is provided. (EVH)

Documents acquired by ERIC include many informal unpublished materials
not available from other sources. ERIC makes every effort to obtain the
best copy available. Nevertheless, items of marginal reproducibility are
often encountered and this affects the quality of the microfiche and
hardcopy reproductions ERIC makes available via the ERIC Document
Reproduction Service (EDRS). EDRS is not responsible for the quality of
the original document. Reproductions supplied by EDRS are the best that
can be made from the original.

The Educational Resources Information Center (ERIC) is operated by the
National Institute of Education of the United States Department of Health,
Education, and Welfare. It is an information system dedicated to the
improvement of education through the dissemination of conference
proceedings, instructional programs, manuals, position papers, program
descriptions, research and technical reports, literature reviews, and
other types of material. ERIC aids school administrators, teachers,
researchers, information specialists, professional organizations,
students, and others in locating and using information which was
previously unpublished or which would not be widely disseminated
otherwise.


The ERIC Clearinghouse on Tests, Measurement, and Evaluation (ERIC/TM)
acquires and processes documents and journal articles within the scope of
interest of the Clearinghouse for announcement in ERIC's index and abstract
journals: Resources in Education (RIE) and Current Index to Journals in
Education (CIJE).


Besides processing documents and journal articles, the Clearinghouse
has another major function: information analysis and synthesis. The
Clearinghouse prepares bibliographies, literature reviews, state-of-the-art
papers, and other interpretive reports on topics in its area of interest.

This bibliography was compiled to provide access to research and
discussions of the use and misuse of grade equivalent scores, i.e., those
scores which reflect a person's performance on a test or battery of tests
according to grade norms. It is not limited to any educational level, nor
is it restricted to any specific curriculum area. Two data bases were
searched by computer.

A computer search of the ERIC data base yielded documents announced in
Resources in Education and journal articles indexed in Current Index to
Journals in Education, which covers over 700 education-related journals.
Also searched by computer was Psychological Abstracts, an index providing
summaries of literature in psychology and related disciplines. Over 800
journals, technical reports, monographs, and other scientific documents
are regularly covered. All data fields in both data bases were searched
for the terms grade, equivalent, and scores.

The ERIC data base was searched in January 1977. ERIC began collecting
information for RIE in 1966 and for CIJE in 1969. At the time of the
search, the data base was complete through December 1976. Psychological
Abstracts was searched in January 1977, and the data base dates from 1967.

For ERIC documents (those with an ED number appearing at the end of
the bibliographic citation) the following information is presented when
available: personal or corporate author, title, date of publication,
number of pages, and ED number. These documents may be purchased in hard
copy or in microfiche from the ERIC Document Reproduction Service (EDRS).
Price information and an order form are appended. However, ERIC microfiche
collections are available at approximately 590 locations throughout the
country, and most of these collections are open to the public. If you
are unable to find a collection in your area, you may write ERIC/TM for a
listing.

Journal articles (those entries appearing with an EJ number or
otherwise identified as journals by the bibliographic citation) are not
available from EDRS. However, most of these journals are readily available
in college and university libraries as well as some large public
libraries.

All entries are listed alphabetically by author and are numbered.
An abstract, or in the case of most journal articles, a shorter annotation,
is provided for each entry. A subject index consisting of ERIC
descriptors and identifiers reflecting major emphasis is also provided.
Numbers appearing in the index refer to entries.

1. Aleyideino, Samuel C. The Effects of Response Methods and Item Types
upon Working Time Scores and Grade Equivalent Scores of Pupils Differing
in Levels of Achievement. Dissertation Abstracts, Vol. 29, No. 6-A, 1968,
pages 1774-1775.


This study was conducted to investigate any differential effects
upon performance on an achievement test battery of two methods of marking
responses to test items. One method involved the use of a separate answer
sheet and the other the use of the test booklet for direct recording of
responses. The complete battery of the Iowa Tests of Basic Skills was
used. In addition to the grade-equivalent scores provided by this battery,
two other scores indicative of rate of work were obtained. One such score
was based on the time required to reach a point about midway through the
test and the other on the time required for the entire test. Contrary
to expectation, performance as measured in terms of grade equivalent scores
proved to be more affected by method of marking responses in the case of
subtests comprised of time-consuming types of items. In general, pupils
who marked responses directly on the booklet tended to make higher
grade-equivalent scores than pupils using the separate answer sheet. The
only subtests in which significant interaction effects between response
methods and ability levels were observed with grade-equivalent scores as
the criterion were the two arithmetic subtests. In both instances pupils at
higher ability levels profited more from marking their responses directly
on the test booklet.


2. Badal, Alden W. and Larsen, Edwin P. Measurement in Education: On
Reporting Test Results to Community Groups. East Lansing, Michigan: National
Council on Measurement in Education. Special Report, Vol. 1, No. 4,
May 1970, 12 pages. ED 051 297. Available only from: National Council
on Measurement in Education, Office of Evaluation Services, Michigan State
University, East Lansing, Michigan 48823 ($0.50)


The major elements of a test interpretation model which would assist school
personnel in presenting standardized test information to the public are
presented. The model is a prototype based upon the testing program used
in the Oakland, California Public Schools. An outline and sequence of the
test score presentation are suggested, including notes on important
background concepts. A discussion of test scores as they reflect school
needs, and a selection of questions frequently asked by parent and
community groups are provided. Consideration is given to uses of test scores,
questions in interpreting test results, types of tests given in schools,
test norms on comparison groups, types of test scores, summary statistics,
and suggested data for presentation. Statistical illustrations are provided.


3. Berends, Margery Lois. An Analysis of Error Patterns, Rates and Grade
Equivalent Scores on Selected Reading Measures at Three Levels of
Performance. Ann Arbor, Michigan: University Microfilms, 1971. 154 pages.
ED 066 716. Available only from University Microfilms, Dissertation Copies,
P.O. Box 1764, Ann Arbor, Michigan 48106 (Order No. 72-21,831, M. Film,
$4.00; Xerography, $10.00)


This study examined the reading grade equivalents, oral reading rates, and
prevailing error patterns of fourth grade disabled readers on standardized
oral reading tests to determine if there were significant differences in
the results obtained among the various instruments. Errors made on the
oral paragraphs/stories from the Durrell Analysis of Reading Difficulty,
the Gates McKillop Reading Diagnostic Tests, and the Standard Reading
Inventory at each of the three levels of reading performance were analyzed.
Comparisons of the resulting error patterns were made between tests and
between levels of performance. Most differences in the mean grade
equivalent scores were significant. Rates of reading differed significantly
between the levels of performance and all but two between-test comparisons
were significant. Errors which decreased as the difficulty level increased
were repetitions and corrections. Errors which decreased as the difficulty
level increased included visual auditory, syllabic division, directional
confusion, words aided, medial errors, and ending errors. The total vision
perception category and omissions did not change in frequency as difficulty
level increased. The agreement among the errors by the three tests was
highly significant.


4. Bergsten, Jane Williams. The Effects of Gluaete Gentes in the Norming
of Achievement Test Battery. February 1975. 8 pages. ED 070 125.

Using the grade equivalent composite scores on the Iowa Tests of Basic
Skills of Iowa fourth grade public school pupils who took the tests in
January 1970, a study was made to determine the relative precision with
which an estimate could be made of the individual percentile norms from
different types of cluster sample designs. Five scores ranging from the
14th to the 93rd percentiles were selected, and the proportions below
these five scores became the proportions to be estimated. The variances
of the estimates of these five proportions were computed for over 20
different sample designs; results from seven sample designs are presented.
Using the error variances that were computed for each of the seven sample
designs, the ratio of the error variance based on a cluster sample to the
error variance based on a simple random sample of pupils was calculated.
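
As an illustration only, the ratio described in this abstract (the error
variance of a proportion estimated from a cluster sample divided by the
error variance from a simple random sample of the same total size) can be
sketched as follows. The cluster proportions, the cluster size, and the
function names are hypothetical and are not taken from the study;
equal-sized clusters drawn at random are assumed, and the finite
population correction is ignored.

```python
from statistics import mean, variance


def srs_variance(p, n):
    """Variance of a sample proportion under simple random sampling
    of n pupils (finite population correction ignored)."""
    return p * (1 - p) / n


def cluster_variance(cluster_props):
    """Variance of the mean of the cluster proportions, assuming
    equal-sized clusters selected at random."""
    return variance(cluster_props) / len(cluster_props)


def variance_ratio(cluster_props, cluster_size):
    """Ratio of cluster-sample error variance to simple-random-sample
    error variance for the same total number of pupils."""
    p = mean(cluster_props)
    n = len(cluster_props) * cluster_size
    return cluster_variance(cluster_props) / srs_variance(p, n)


# Hypothetical data: proportion of pupils below a given score in each
# of five sampled schools, with 25 pupils tested per school.
props = [0.30, 0.50, 0.40, 0.60, 0.20]
ratio = variance_ratio(props, cluster_size=25)
print(round(ratio, 2))
```

With these made-up figures the ratio is about 2.6, which would mean the
cluster estimate is considerably less precise than a simple random sample
of the same number of pupils.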

5. Cooney, G.H. Standardization Procedures Involving Moderator Variables:
Some Theoretical Considerations. Australian Journal of Education, Vol. 19,
No. 1, March 1975, pages 50-63. EJ 122024.

Some theoretical points relevant to grading and the transforming of test
scores are considered in this paper, and a comparison is made of two methods
of scaling which use moderator variables.

6. Davis, John E. Indiana State Council Reading Test Survey. International
Reading Association, Indiana State Council, 1975. 6 pages. ED 112 382.

A questionnaire was designed to study the uses made of reading tests
by classroom teachers in Indiana with at least one year of experience in
their respective classrooms. Of the 185 questionnaires distributed by
local reading councils, 51 questionnaires were returned. The teachers
responding taught in grades one through seven. They reported using 37
different tests: 84 percent used a battery of tests accompanying a basal
reading series; 139 percent used reading achievement tests (some teachers
used more than one achievement test); 41 percent used diagnostic tests;
and 10 percent reported using intelligence tests as reading tests. An
analysis of the responses indicated that most teachers probably use the
previous year's scores to determine level of reading material and group
placement and that they interpret grade equivalent scores as representing
reading ability. Grade equivalent scores were found to be the most common
test information in cumulative folders.

7. Donlon, Thomas F. Whose Zoo? Fry's Orangoutang Score Revisited.
Reading Teacher, Vol. 25, No. 1, October 1971, pages 7-10. EJ 044190.

In response to Fry's article in the January 1971 Reading Teacher,
"The Orangoutang Score," this article suggests that chance level
scores should not be assumed to have arisen only from random processes
and that there are methods for detecting a thoughtful, nonrandom
performance.

8. Ellis, E.N. Survey of Achievement in Reading in Grade 3 of Vancouver
Schools, October 26-30, 1970. Vancouver, British Columbia: Vancouver
Board of School Trustees, April 1971. 6 pages. ED 058 261.

A summary of the results of the Metropolitan Upper Primary Reading
Test, Form B, which was administered to third grade pupils in Vancouver,
is presented. Included are tables of local norms and comparisons of
the results of earlier surveys. In general these pupils performed at
levels six months above the publishers' grade equivalent norm in "Word
Discrimination" and three months above the publishers' norm in "Reading
Comprehension". These results are consistent with those of earlier
surveys of reading in the primary grades. Because of the large number
of perfect scores, it was concluded that the tests were too easy for
many pupils.


9. Ellis, E.N. Survey of Achievement in Reading in Grade 6 of Vancouver
Schools, November 2-6, 1970. Vancouver, British Columbia: Vancouver
Board of School Trustees, May 1971. 5 pages. ED 058 262.

Results of the Gates-MacGinitie Reading Test for sixth graders are
summarized. Tables include information for each subtest concerning
mean scores, percentile ranks, grade equivalent scores, and standard
scores. The level of achievement in this survey was lower than that
of previous years.


10. Fennessey, James. Using Achievement Growth to Analyze Educational
Programs. Work Unit 2A. Reward Structures - Achievement Growth. Baltimore,
Maryland: Johns Hopkins University, Center for the Study of Social
Organization of Schools, 1973. 39 pages. ED 084 306.

The single most important output of any school is probably the magnitude
of its students' growth in academic achievement. A variety of standardized
tests has been developed to measure aspects of this achievement;
however, only recently have administrators attempted to use such tests
to help review and make decisions about educational programs. There have
been such applications of achievement tests recently, as well as associated
problems. One often unrecognized problem is that for these program
analysis applications, it is necessary to develop a score format appropriate
to the decision context, and one which has the properties of an interval
scale. There have been some difficulties inherent in past attempts to
develop interval scales of academic achievement; these difficulties carry
several implications. With a more open-minded and pragmatic approach,
research and development work on some of these issues can be done rather
easily and inexpensively.


11. Hathaway, Walter E. The Appropriate and Inappropriate Uses of Grade
Level Equivalents in School Evaluation. April 1975. 13 pages. ED 109 246.
Hard copy not available from EDRS.

The Portland Board of Education had requested that the Oregon Central
Evaluation Department provide student achievement data so as to allow
comparisons with other school districts by reporting national grade
level equivalent (GLE) scores on standardized tests of reading and
mathematics for grades 4 and 8. For years, the position of most research
and evaluation personnel in Portland's district has been that national
GLEs are an inadequate and misleading type of score for representing
student achievement in the district. This position has been based on
information about the discrepant meaning of GLEs from test to test
and also upon certain technical characteristics of these scores that
might make them unsuitable for research and evaluation purposes. This
paper discusses the advantages, disadvantages, differences in variations,
interpretations, interpolations and alternatives to reporting GLEs and
other standardized scores.


12. Hieronymus, A. N. and Stroud, James B. Comparability of IQ Scores on
Five Widely Used Intelligence Tests. Measurement and Evaluation in
Guidance, Vol. 2, No. 3, 1969, pages 135-140.

Data were gathered from 41 Iowa school systems with 1655 pupils in
Grade 4, 1637 in Grade 7, and 1665 in Grade 10. Each of the pupils
took the Lorge-Thorndike Intelligence Tests and 1 other test, either
the CTMM, the Henmon-Nelson Test of Mental Ability, the Kuhlmann-
Anderson Intelligence Tests, or the Otis Quick-Scoring Mental Ability
Test. Statistical comparability of verbal and nonverbal scores
were analyzed. Correlations were well below equivalent-forms
reliability coefficients, which indicate that the various tests are
measuring somewhat different traits. Differences in comparable IQs
vary from test to test, from grade to grade, and with IQ level.


13. Horst, Donald P. and Tallmadge, G. Kasten. A Procedural Guide for
Validating Achievement Gains in Educational Projects. Monograph
Series on Evaluation in Education, No. Los Altos, California:
RMC Research Corporation, 1976. 103 pages. ED 126 135. For related
documents, see ED 096 344 and ED 104 918. Also available from:
Superintendent of Documents, U.S. Government Printing Office,
Washington, D.C. 20402 ($2.10)

The orientation of this report is that of identifying educational
projects which can be considered clearly exemplary. The largest
section consists of a 22-step procedure for validating the effectiveness
of educational projects using existing evaluation data. It is
not intended as a guide for conducting evaluations but rather for
interpreting data assembled by others using a wide variety of
experimental and quasi-experimental designs. As such, its coverage is
not restricted to "good" designs. It encompasses all of the commonly
employed evaluation models, but is not so much concerned with assessing
the relative usefulness of various designs as with the deficiencies and
hazards inherent in each of them. It also offers suggestions for
correcting those results when certain measurement or analysis principles
have been violated. Included as appendices are a discussion of the
issues surrounding use of criterion- versus norm-referenced tests,
description of the logic and mathematical structures of certain regression
models, and an overview of the hazards associated with the use of
percentiles and grade equivalent scores to describe academic performance.
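
One hazard of grade equivalent scores can be made concrete with a small
sketch. A grade equivalent is conventionally obtained by locating a raw
score among the median raw scores of the grades in the norming sample and
interpolating between them. The norm table, the function name, and the
decision to clamp scores outside the normed range (rather than extrapolate)
are assumptions made for illustration; they do not come from the report.

```python
from bisect import bisect_right

# Hypothetical norm table: (grade placement, median raw score).
NORMS = [(3.1, 18.0), (4.1, 26.0), (5.1, 32.0), (6.1, 36.0)]


def grade_equivalent(raw, norms=NORMS):
    """Linearly interpolate a grade equivalent from a norm table.

    Scores below the lowest or above the highest normed median are
    clamped to the end points instead of being extrapolated.
    """
    grades = [g for g, _ in norms]
    medians = [m for _, m in norms]
    if raw <= medians[0]:
        return grades[0]
    if raw >= medians[-1]:
        return grades[-1]
    i = bisect_right(medians, raw)  # first median strictly above raw
    g0, m0 = norms[i - 1]
    g1, m1 = norms[i]
    return g0 + (g1 - g0) * (raw - m0) / (m1 - m0)


print(round(grade_equivalent(29), 1))  # halfway between 26 and 32
print(round(grade_equivalent(34), 1))  # halfway between 32 and 36
```

In this hypothetical table a 6-point raw score gain (26 to 32) and a
4-point gain (32 to 36) each span one full grade, so equal differences in
grade equivalents need not correspond to equal differences in raw score.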


14. Jensema, Carl J. A Note on the Achievement Test Scores of Multiple
Handicapped Hearing Impaired Students. American Annals of the Deaf,
Vol. 120, No. 1, February 1975, pages 37-39.

Examined Stanford Achievement Test grade equivalent scores of 16,822
hearing-impaired students in a nationwide program testing 19,000
students. Eleven subgroups were formed, each subgroup consisting of
students reported as having 1 of 11 additional handicapping conditions.
Mean grade equivalents were calculated for each subgroup to demonstrate,
as a reminder, that each specific kind of additional handicap tends
to exert a unique degree of influence on academic achievement.


15. Larsen, Edwin P. Opening Institutional Ledger Books--A Challenge to
Educational Leadership: Suggestions for Talking to School-Community
Groups About Testing and Test Results. TM Report No. 28. Princeton,
New Jersey: ERIC Clearinghouse on Tests, Measurement and Evaluation,
December 1974. 13 pages. ED 099 425.

Three key areas are outlined dealing with the development of public
understanding of testing: (1) Why tests are administered in schools: needs
assessment, instructional program evaluation, materials selection,
reporting to public, documenting individual growth, diagnostic analysis
and planning, and instructional grouping. (2) Types of tests used,
featuring explanations of achievement tests, Scholastic Aptitude Tests,
interest tests, specialized aptitude tests, and personality tests.
(3) Interpretation of test norms, raw scores, grade equivalent scores,
percentile ranks and stanines, I.Q. scores, and summarizing results
(medians and quartiles). Methods used to chart test results of a school
or district are discussed and suggestions made for the basic tools
needed, the need for minimum use of numbers, and the facility of
percentile ranks. Tables and charts presenting statistical information
are proposed, and suggestions include highlighting specific skills,
comparing aptitude and achievement, and charting growth from grade to
grade. Finally, in discussing results and school accountability, the
following are proposed: assume leadership--an advocacy position in
identifying discrepancies in pupil performance (needs), relate results to
instructional efforts, discuss resource needs of the district and school,
outline noninstructional problems the school and community must address,
and appropriate accountability.

16. McElroy, Arthur A. Comparison of Grade Equivalent Scores Among Batteries
on Two Subtests of the Metropolitan Achievement Test with Educable Mentally
Retarded Children. Dissertation Abstracts International, Vol. 30, No. 11-A,
May 1970, pages 4688-4689.

The purpose of this study was to investigate the stability of the grade
equivalent scores among batteries of the Metropolitan Achievement Tests
when administered to retarded children. It had been indicated that any
differences between batteries would be due to differences in the student's
abilities rather than in the tests. This led to the hypotheses of no
significant differences between the test batteries. The study used a
single-factor repeated measures design. The dependent variable was
the correct responses converted to grade equivalent scores. The
independent variables were the two subtests from the three batteries. A
computer program which performs an analysis of variance for a factorial
design was used to analyze the data. Results upheld the hypothesis that
there would be no significant differences between batteries on the Reading
Subtest. The second hypothesis was partially rejected, due to significant
differences between the Primary II Battery and the other batteries on
the Arithmetic Problem Solving and Concepts Subtest. There were no
significant differences, however, between the Elementary and Intermediate
Batteries on this subtest. These results tended to confirm the stability
of the grade equivalent scores among the three batteries of the
Metropolitan Achievement Tests. These results seem to suggest that the lower
level battery indicated by the mental ages and estimated grade levels
of the students would be the appropriate level to commence testing. The
next higher battery would need to be administered only to those limited
by a ceiling effect of the lower battery. Attention should be given to
the Arithmetic Problem Solving and Concepts Subtest to insure correspondence
between the content taught in the curriculum and the factors measured by
the different batteries. With the lack of any significant differences
between the Elementary Battery and the Intermediate Battery on either
subtest, it would appear prudent to use the Intermediate Battery only
when a ceiling effect is shown by the lower battery.


17. Pen, Dallis. Interpreting Standardized Test Scores. St. Paul, Minnesota:
Minnesota University, Student Counseling Bureau, 1971. 57 pages. ED 053 201.

Principles of test administration, test validity, and accuracy of
measurement underlying interpretation of standardized test scores in
educational administration, instruction, and guidance are presented. Types
of norm-referenced score transformations, including percentiles, standard
scores, and grade equivalents, and of criterion referenced scores, including
content scales, predicted scores, and expectancies, are described, and
their applications are illustrated. Special attention is paid to
multi-score tests and the representation of their scores as profiles and
similarity indexes.

18. Ricks, James H. On Telling Parents About Test Results. New York City,
New York: Psychological Corporation, December 1959. 4 pages. ED 079 386.

Two principles and one verbal technique provide a sound basis for
communicating to students and their parents the information obtained from
testing: (1) parents have the right to know whatever the school knows about
the abilities, performance and problems of their children; (2) the school
has the obligation to see that it communicates understandable and usable
knowledge; and (3) preface the analysis of test results with the phrase
"you (or your son/daughter) score like people who..." Communicating
test results meaningfully involves attention to content, language and
audience. IQs should rarely be reported to students or their parents.
Grade placement scores or standard scores are less likely to cause trouble,
but they require careful explanation. Percentiles are probably the safest
and most informative numbers to use provided it is made clear that they
refer not to the percent of questions answered correctly but to the
percent of people whose performance the student has equalled or surpassed
and provided it is made clear who the people are with whom the student
is being compared.
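
The distinction drawn at the end of this abstract, between the percent of
questions answered correctly and the percent of people equalled or
surpassed, can be illustrated with a brief sketch. The norm group, the raw
score, the test length, and the function names below are all hypothetical.

```python
def percentile_rank(score, reference_scores):
    """Percent of the reference group whose performance the given
    score equals or surpasses."""
    if not reference_scores:
        raise ValueError("reference group must not be empty")
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100.0 * at_or_below / len(reference_scores)


def percent_correct(raw_score, n_items):
    """Percent of test questions answered correctly."""
    return 100.0 * raw_score / n_items


# Hypothetical raw scores of a ten-pupil comparison group on a 40-item test.
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]
student_raw = 27

print(percentile_rank(student_raw, norm_group))  # share of group equalled or surpassed
print(percent_correct(student_raw, 40))          # share of items answered correctly
```

The two figures answer different questions, and the first changes whenever
the comparison group changes, which is why the abstract stresses making
clear who the student is being compared with.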


19. Stake, Robert E. Measuring What Learners Learn (With A Special Look
At Performance Contracting). Urbana, Illinois: Illinois University,
Center for Instructional Research and Curriculum Evaluation, 1971.
41 pages. ED 052 234.

A discussion of performance contracting, defined as an agreement between
a group offering instruction and a school needing the services, is
presented. Four major hazards to direct measurement of specific learning
are considered: poor statement of objectives; selection of the wrong
tests; misinterpretation of test scores; and depersonalization of
contemporary life. These and other problems such as human and testing error,
valid criterion testing, and the question of when to test, are discussed
in full. The relationship of these hazards of performance measurement
to performance contracting, and to regular school programs, is presented.


20. Tallmadge, G. Kasten and Horst, Donald P. A Procedural Guide for
Validating Achievement Gains in Educational Projects. Los Altos,
California: RMC Research Corporation, May 1974. 89 pages. ED 096 344.
For related documents, see ED 104 918 and ED 126 135.

The orientation of this report is that of identifying educational projects
which can be considered truly exemplary. The bulk of the report consists
of a 23-step procedure for validating the effectiveness of educational
programs using existing evaluation data. It is not intended as a guide for
conducting evaluations but rather for interpreting data assembled by others
using a wide variety of experimental and quasi-experimental designs. As
such, its coverage is not restricted to "good" designs. It encompasses all
of the commonly employed evaluation models. The report is concerned with
deficiencies and hazards of various designs with emphasis on the weaker
ones which, as it happens, are also the most feasible in real-world
settings, the least costly, and the most commonly used. The appendixes
contain project selection criteria worksheets, information regarding
norm-referenced versus criterion-referenced tests, estimation of treatment
effect from the performance of an initially superior comparison group,
effects of noncomparable testing dates on experimental group versus norm
group comparisons, and problems using grade-equivalent scores in
evaluating educational gains.


21. Tallmadge, G. Kasten and Horst, Donald P. A Procedural Guide for
Validating Achievement Gains in Educational Projects (Revised). Los Altos,
California: RMC Research Corporation, December 1974. 100 pages.
ED 104 918. For related documents, see ED 096 344 and ED 126 135.

The orientation of this report is that of identifying educational
projects which can be considered truly exemplary. The bulk of the report
consists of a 23-step procedure for validating the effectiveness of
educational programs using existing evaluation data. It is not intended
as a guide for conducting evaluations but rather for interpreting data
assembled by others using a wide variety of experimental and
quasi-experimental designs. As such, its coverage is not restricted to
"good" designs. It encompasses all of the commonly employed evaluation
models. The report is concerned with deficiencies and hazards of various
designs with emphasis on the weaker ones which, as it happens, are also
the most feasible in real-world settings, the least costly, and the most
commonly used. The appendixes contain project selection criteria
worksheets, information regarding norm-referenced versus
criterion-referenced tests, estimation of treatment effect from the
performance of an initially superior comparison group, effects of
noncomparable testing dates on experimental group versus norm group
comparisons, and problems using grade-equivalent scores in evaluating
educational gains. Changes from the original version include the removal
of material which dealt with project selection criteria unrelated to
cognitive achievement benefits.


22. Tucker, Elizabeth Sulzby. Grade Level Expectations and Grade Level
Scores in Reading Tests. 1975. 23 pages. ED 123 590.

Statistical methods and test writing for reading comprehension have been
based on the assumption that certain reading tasks or levels are
appropriate at one age level that would be too difficult at another. A
clear-cut determination of grade levels for reading materials has, however,
not been defined. Grade and age equivalent scores on silent reading tests,
readability scores attached to children's reading matter, and reading
grade expectancy scores are investigated in light of their usefulness.
The history of the grade equivalent scores used in standardized tests
and in readability scores can be perceived as reflecting a circular,
or skyhook, relationship between these scores and curricular material.
A testing procedure consisting of items beginning at a basal level,
where a student was convinced of his or her mastery, and continuing
until a ceiling of error is reached can eliminate the inattention factor
and the guess factor. In addition, such a system can preserve the
efficiency of the group format and make test results more interpretable to
the specialist, as well as more understandable to the child.


23. Wardrop, James L. Was New Century Teaching Students the Gates-MacGinitie
Reading Tests? Urbana, Illinois: Illinois University, Center for
Instructional Research and Curriculum Evaluation, 1971. 10 pages.
ED 055 067.

"Teaching the test" has been defined in terms of teaching those particular
content knowledges or skills needed to answer the test items correctly.
Evidence of several sorts examined in this paper clearly indicates that
New Century was teaching students in Providence, R.I., the Gates-MacGinitie
Reading Test, which was used to assess their vocabulary achievement. The
coincidence between vocabulary taught in the instructional package and
the vocabulary required to respond correctly to test items on the
Gates-MacGinitie was determined to be much greater than could be attributed
to chance, and the data showed that the teaching program needed to be only
moderately effective to improve substantially student gains in
grade-equivalent scores on the test. On the basis of the analyses summarized
in the paper, if the instructional materials are only 30 percent effective,
scores would average nearly twice those that would normally be found.


24. Utsey, Jordan. Simulation in Reading. December 1966. 13 pages.
ED 013 703.

An attempt to improve the reliability, validity, and efficiency of all
reading instruction by modifying certain dimensions of teacher behavior
is reported. A survey in Oregon indicated that to determine the functional
reading level of students, 74 percent of the teachers used grade equivalent
scores from achievement tests, 24 percent used information from cumulative
folders, and 30 percent used combinations. Materials were developed
to give prospective teachers an opportunity to learn the marking code of
the informal reading inventory, to practice, and to evaluate their skill.
A series of simulated instructional films and printed materials was
devised. The process experienced by the teachers in three class periods
is described. One hundred undergraduate students were studied to
determine the efficiency of the material. The results indicated that
teachers, after viewing simulated material, were 92 percent accurate in
assessing functional reading level. After revision of the material, a
second study was conducted with 50 subjects. The results indicated 94
percent accuracy. A discussion of transfer into actual classroom practice
and references are included.